Character-level Convolutional Networks for Text Classification
This article offers an empirical exploration on the use of character-level
convolutional networks (ConvNets) for text classification. We constructed
several large-scale datasets to show that character-level convolutional
networks could achieve state-of-the-art or competitive results. Comparisons are
offered against traditional models such as bag of words, n-grams and their
TFIDF variants, and deep learning models such as word-based ConvNets and
recurrent neural networks.
Comment: An early version of this work, entitled "Text Understanding from
Scratch", was posted in Feb 2015 as arXiv:1502.01710. The present paper has
considerably more experimental results and a rewritten introduction. Advances
in Neural Information Processing Systems 28 (NIPS 2015).
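The input encoding behind character-level ConvNets can be illustrated with a short sketch. This is not the paper's exact pipeline; the abbreviated alphabet and the 1014-character length are illustrative assumptions (the paper uses a fixed alphabet and a fixed input width, with out-of-alphabet characters mapped to zero vectors).

```python
import numpy as np

# Abbreviated alphabet for illustration; the paper uses a larger fixed set.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;:!?'-"
CHAR_TO_IDX = {c: i for i, c in enumerate(ALPHABET)}

def quantize(text, max_len=1014):
    """One-hot encode a string at the character level.

    Returns an array of shape (len(ALPHABET), max_len). Characters outside
    the alphabet, and padding positions past the end of the text, are
    all-zero columns. This matrix is what a 1-D ConvNet consumes.
    """
    x = np.zeros((len(ALPHABET), max_len), dtype=np.float32)
    for pos, ch in enumerate(text.lower()[:max_len]):
        idx = CHAR_TO_IDX.get(ch)
        if idx is not None:
            x[idx, pos] = 1.0
    return x
```

A 1-D convolution then slides over the position axis, so the model learns directly from raw characters with no tokenizer or word embeddings.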
Optimal dual martingales, their analysis and application to new algorithms for Bermudan products
In this paper we introduce and study the concept of optimal and surely
optimal dual martingales in the context of dual valuation of Bermudan options,
and outline the development of new algorithms in this context. We provide a
characterization theorem, a theorem which gives conditions for a martingale to
be surely optimal, and a stability theorem concerning martingales that are,
in a suitable sense, close to surely optimal. Guided by these results we develop a
framework of backward algorithms for constructing such a martingale. In turn
this martingale may then be utilized for computing an upper bound of the
Bermudan product. The methodology is purely dual in the sense that it does not
require input approximations to the Snell envelope. In an It\^o-L\'evy
environment we outline a particular regression based backward algorithm which
allows for computing dual upper bounds without nested Monte Carlo simulation.
Moreover, as a by-product this algorithm also provides approximations to the
continuation values of the product, which in turn determine a stopping policy.
Hence, we may obtain lower bounds at the same time. In a first numerical study
we demonstrate the backward dual regression algorithm in a Wiener environment
at well known benchmark examples. It turns out that the method is at least
comparable to the one in Belomestny et al. (2009) regarding accuracy, while
offering several advantages in computational robustness.
Comment: This paper is an extended version of Schoenmakers and Huang, "Optimal
dual martingales and their stability; fast evaluation of Bermudan products
via dual backward regression", WIAS Preprint 157.
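The dual valuation principle the abstract builds on can be sketched in a few lines: any martingale M with M_0 = 0 yields an upper bound V_0 <= E[max_t (Z_t - M_t)] on the Bermudan price, where Z_t is the discounted payoff. The sketch below uses the trivial martingale M = 0 on simulated geometric Brownian paths, which gives a valid but loose bound; the paper's contribution is a backward regression algorithm that constructs a near-optimal M without nested simulation. All parameters here are illustrative assumptions.

```python
import numpy as np

def dual_upper_bound(paths, payoff, martingale=None):
    """Dual (Rogers-type) upper bound: E[ max_t (Z_t - M_t) ].

    paths:      array (n_paths, n_steps+1) of the underlying asset
    payoff:     maps the path array to discounted exercise values Z
    martingale: array of the same shape as Z with M_0 = 0; defaults to
                the trivial martingale M = 0 (a loose bound)
    """
    Z = payoff(paths)
    M = np.zeros_like(Z) if martingale is None else martingale
    return np.mean(np.max(Z - M, axis=1))

# Illustrative Bermudan put under geometric Brownian motion.
rng = np.random.default_rng(0)
n, steps, dt = 20000, 10, 0.1
S0, r, sigma, K = 100.0, 0.05, 0.2, 100.0
dW = rng.standard_normal((n, steps)) * np.sqrt(dt)
logS = np.cumsum((r - 0.5 * sigma**2) * dt + sigma * dW, axis=1)
S = S0 * np.exp(np.hstack([np.zeros((n, 1)), logS]))
disc = np.exp(-r * dt * np.arange(steps + 1))  # discount factors per date
ub = dual_upper_bound(S, lambda s: disc * np.maximum(K - s, 0.0))
```

A tighter bound comes from plugging in a martingale close to the Doob martingale of the Snell envelope, which is exactly what the backward regression algorithm approximates.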
Attention-Based End-to-End Speech Recognition on Voice Search
Recently, there has been a growing interest in end-to-end speech recognition
that directly transcribes speech to text without any predefined alignments. In
this paper, we explore the use of an attention-based encoder-decoder model for
Mandarin speech recognition on a voice search task. Previous attempts have
shown that applying attention-based encoder-decoder to Mandarin speech
recognition was quite difficult due to the logographic orthography of Mandarin,
the large vocabulary and the conditional dependency of the attention model. In
this paper, we use character embedding to deal with the large vocabulary.
Several tricks are used for effective model training, including L2
regularization, Gaussian weight noise and frame skipping. We compare two
attention mechanisms and use attention smoothing to cover long context in the
attention model. Taken together, these tricks allow us to finally achieve a
character error rate (CER) of 3.58% and a sentence error rate (SER) of 7.43% on
the MiTV voice search dataset. When combined with a trigram language model,
the CER and SER drop to 2.81% and 5.77%, respectively.
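The attention smoothing mentioned above can be illustrated with a small sketch. A common form (proposed by Chorowski et al. for speech attention) replaces the exponential in the softmax with a logistic sigmoid before normalizing, which flattens the weight distribution so the decoder attends over a longer acoustic context. Whether this is the exact variant used here is an assumption; the sketch shows the general idea.

```python
import numpy as np

def softmax_attention(scores):
    """Standard softmax attention weights over a score vector."""
    e = np.exp(scores - scores.max())  # shift for numerical stability
    return e / e.sum()

def smoothed_attention(scores):
    """Smoothed attention: sigmoid instead of exp, then normalize.

    The sigmoid saturates for large scores, so no single frame can
    dominate the way it does under softmax; the resulting distribution
    is flatter and covers more of the encoder output.
    """
    s = 1.0 / (1.0 + np.exp(-scores))
    return s / s.sum()
```

For a sharply peaked score vector, the smoothed weights spread probability mass over more encoder frames than the softmax weights do.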